Kernel Density Estimation

Adrija Srijani Yenisi

INTRODUCTION

  • In statistics, kernel density estimation (KDE) applies kernel smoothing to probability density estimation: it is a non-parametric method that estimates the probability density function of a random variable from a sample \(x_1, \ldots, x_n\), using a kernel \(K\) as the weighting function and a bandwidth \(h_n > 0\). \[ \widehat{f}_{n}(x)=\frac{1}{nh_n}\sum_{i=1}^{n}K\Big(\frac{x-x_{i}}{h_n}\Big) \]
  • Project Objective: To systematically investigate and illustrate how the sample size \(n\), the bandwidth \(h\), and the choice of kernel shape the accuracy and characteristics of KDEs, providing insights for parameter selection.
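
The estimator above can be sketched directly in Python with NumPy. This is a minimal illustration, not the code behind the slides: the function names `gaussian_kernel` and `kde`, the simulated sample, and the bandwidth value are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x_grid, sample, h, kernel=gaussian_kernel):
    """Evaluate (1 / (n h)) * sum_i K((x - x_i) / h) at every x in x_grid."""
    x_grid = np.asarray(x_grid, dtype=float)
    sample = np.asarray(sample, dtype=float)
    # Rows index evaluation points, columns index observations.
    u = (x_grid[:, None] - sample[None, :]) / h
    return kernel(u).sum(axis=1) / (sample.size * h)

# Illustrative data: 500 draws from N(0, 1) and an arbitrary bandwidth.
rng = np.random.default_rng(0)
sample = rng.normal(size=500)
grid = np.linspace(-4.0, 4.0, 201)
f_hat = kde(grid, sample, h=0.4)
```

Passing a different function as `kernel` swaps the kernel without changing the rest of the estimator.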

Kernel

    Let \(K(\cdot)\) be a pdf on the real line that satisfies
  • \[\sup_{x \in \mathbb{R}} K(x) \leq M < \infty, \qquad |x|K(x) \to 0 \quad \text{as} \quad |x| \to \infty\]

  • \(K(x)=K(-x)\)

  • \(\int x^2K(x)dx < \infty\)
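
These conditions can be checked numerically for the kernels compared later in the slides. The sketch below assumes the usual textbook forms of the box, Epanechnikov, Gaussian, logistic, tricube, and cosine kernels; the dictionary `KERNELS`, the helper `check_kernel`, and the integration grid are hypothetical choices for illustration.

```python
import numpy as np

# Assumed textbook definitions of the kernels named in the slides.
KERNELS = {
    "box": lambda u: 0.5 * (np.abs(u) <= 1),
    "epanechnikov": lambda u: 0.75 * (1 - u**2) * (np.abs(u) <= 1),
    "gaussian": lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi),
    "logistic": lambda u: 1.0 / (np.exp(u) + 2.0 + np.exp(-u)),
    "tricube": lambda u: (70 / 81) * (1 - np.abs(u)**3)**3 * (np.abs(u) <= 1),
    "cosine": lambda u: (np.pi / 4) * np.cos(np.pi * u / 2) * (np.abs(u) <= 1),
}

def check_kernel(K, lo=-40.0, hi=40.0, m=400001):
    """Approximate the kernel conditions with a Riemann sum on [lo, hi]."""
    u = np.linspace(lo, hi, m)
    du = u[1] - u[0]
    k = K(u)
    return {
        "mass": float(np.sum(k) * du),                  # should be close to 1
        "sup": float(k.max()),                          # finite bound M
        "symmetric": bool(np.allclose(k, K(-u))),       # K(x) = K(-x)
        "second_moment": float(np.sum(u**2 * k) * du),  # should be finite
        "tail": float(abs(hi) * K(np.array([hi]))[0]),  # |x| K(x) far out
    }

report = {name: check_kernel(K) for name, K in KERNELS.items()}
for name, r in report.items():
    print(name, r)
```

A grid-based check like this only suggests that a condition holds; it cannot detect a divergent second moment for a heavy-tailed kernel, which has to be verified analytically.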

Estimating Normal using different kernels

Estimating Cauchy using different kernels

Estimating Exponential using different kernels

Estimating Weibull using different kernels

Estimating Mixture using different kernels

Simulation study using varying sample size for different kernels and fixed bandwidth
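
A simulation of this kind can be sketched as follows. The sample sizes, the bandwidth `h = 0.3`, the Gaussian kernel, the standard normal target, and the use of integrated squared error (ISE) as the accuracy measure are all assumptions made for the example; the slides themselves compare the fitted curves visually.

```python
import numpy as np

def normal_pdf(x):
    """Standard normal density; doubles as the Gaussian kernel."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def kde(x_grid, sample, h, kernel=normal_pdf):
    """Kernel density estimate evaluated on x_grid."""
    u = (x_grid[:, None] - sample[None, :]) / h
    return kernel(u).sum(axis=1) / (sample.size * h)

def ise(n, h, rng, grid):
    """Integrated squared error of the KDE for one N(0, 1) sample of size n."""
    sample = rng.normal(size=n)
    err = kde(grid, sample, h) - normal_pdf(grid)
    return float(np.sum(err**2) * (grid[1] - grid[0]))

rng = np.random.default_rng(1)
grid = np.linspace(-5.0, 5.0, 501)
h = 0.3  # bandwidth held fixed while n varies
ise_by_n = {n: ise(n, h, rng, grid) for n in (50, 200, 1000, 5000)}
for n, value in ise_by_n.items():
    print(f"n = {n:5d}  ISE = {value:.5f}")
```

With the bandwidth held fixed, the error should shrink as the sample size grows but is not expected to vanish, because the smoothing bias does not depend on the sample size.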

KDE of Normal using Box Kernel

KDE of Normal using Epanechnikov Kernel

KDE of Normal using Logistic Kernel

KDE of Normal using Tricube Kernel

KDE of Normal using Cosine Kernel

KDE of Normal using Exponential Kernel

KDE of Normal using Cauchy Kernel

KDE of Cauchy using Naive Estimator

KDE of Cauchy using Epanechnikov Kernel

KDE of Cauchy using Gaussian Kernel

KDE of Cauchy using Logistic Kernel

KDE of Cauchy using Tricube Kernel

KDE of Cauchy using Cosine Kernel

KDE of Exponential using Box Kernel

KDE of Exponential using Epanechnikov Kernel

KDE of Exponential using Logistic Kernel

KDE of Exponential using Gaussian Kernel

KDE of Exponential using Tricube Kernel

KDE of Exponential using Cosine Kernel

KDE of Weibull using Box Kernel

KDE of Weibull using Epanechnikov Kernel

KDE of Weibull using Gaussian Kernel

KDE of Weibull using Logistic Kernel

KDE of Weibull using Tricube Kernel

KDE of Weibull using Cosine Kernel

KDE of Binomial using Box Kernel

KDE of Binomial using Epanechnikov Kernel

KDE of Binomial using Gaussian Kernel

KDE of Binomial using Logistic Kernel

KDE of Binomial using Cosine Kernel

KDE of Binomial using Tricube Kernel

KDE of Poisson using Box Kernel

KDE of Poisson using Epanechnikov Kernel

KDE of Poisson using Gaussian Kernel

KDE of Poisson using Logistic Kernel

KDE of Poisson using Cosine Kernel

KDE of Poisson using Tricube Kernel

KDE of 0.5*N(-2,1)+0.5*N(2,1) using Box Kernel

KDE of 0.5*N(-2,1)+0.5*N(2,1) using Epanechnikov Kernel

KDE of 0.5*N(-2,1)+0.5*N(2,1) using Gaussian Kernel

KDE of 0.5*N(-2,1)+0.5*N(2,1) using Logistic Kernel

KDE of 0.5*N(-2,1)+0.5*N(2,1) using Tricube Kernel

KDE of 0.5*N(-2,1)+0.5*N(2,1) using Cosine Kernel

Fixed sample size, varying bandwidth
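
The effect of the bandwidth at a fixed sample size can be sketched in the same way. The example below assumes a Gaussian kernel, a sample of size 1000 from the mixture 0.5*N(-2,1) + 0.5*N(2,1) used elsewhere in the slides, and an arbitrary set of bandwidths; integrated squared error (ISE) is an assumed accuracy measure, not one stated in the slides.

```python
import numpy as np

def normal_pdf(x):
    """Standard normal density; doubles as the Gaussian kernel."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def mixture_pdf(x):
    """Density of 0.5*N(-2,1) + 0.5*N(2,1)."""
    return 0.5 * normal_pdf(x + 2.0) + 0.5 * normal_pdf(x - 2.0)

def kde(x_grid, sample, h, kernel=normal_pdf):
    """Kernel density estimate evaluated on x_grid."""
    u = (x_grid[:, None] - sample[None, :]) / h
    return kernel(u).sum(axis=1) / (sample.size * h)

rng = np.random.default_rng(2)
n = 1000  # sample size held fixed while h varies
component = rng.integers(0, 2, size=n)
sample = rng.normal(loc=np.where(component == 0, -2.0, 2.0), scale=1.0)

grid = np.linspace(-7.0, 7.0, 1401)
dx = grid[1] - grid[0]
ise_by_h = {}
for h in (0.02, 0.1, 0.5, 2.0):
    err = kde(grid, sample, h) - mixture_pdf(grid)
    ise_by_h[h] = float(np.sum(err**2) * dx)
    print(f"h = {h:4.2f}  ISE = {ise_by_h[h]:.5f}")
```

A very small bandwidth gives a noisy, spiky estimate, while a very large one smooths the two modes together; an intermediate value balances the two effects.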

Glivenko-Cantelli Lemma

  • Let \(X_1, \ldots, X_n\) be i.i.d. random variables with distribution function \(F\) on \(\mathbb{R}\). The empirical distribution function is the function of \(x\) defined by \[F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I\{X_i \leq x\}\]
  • Then, as \(n \to \infty\), \[\sup_{x \in \mathbb{R}}|F_n(x)-F(x)| \rightarrow 0 \text{ a.s.}\]
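
The lemma can be illustrated by computing the supremum distance exactly for samples of increasing size. Because the empirical distribution function is a step function, the supremum is attained at the order statistics. The standard logistic distribution, the sample sizes, and the helper names below are assumptions chosen for the sketch.

```python
import numpy as np

def logistic_cdf(x):
    """Distribution function of the standard logistic distribution."""
    return 1.0 / (1.0 + np.exp(-x))

def sup_distance(sample, cdf):
    """Exact sup_x |F_n(x) - F(x)| for a continuous distribution function."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    F = cdf(x)
    # Gap just after each jump of F_n, and just before it.
    upper = np.arange(1, n + 1) / n - F
    lower = F - np.arange(0, n) / n
    return float(max(upper.max(), lower.max()))

rng = np.random.default_rng(3)
distances = {}
for n in (100, 1000, 10000, 100000):
    distances[n] = sup_distance(rng.logistic(size=n), logistic_cdf)
    print(f"n = {n:6d}  sup|F_n - F| = {distances[n]:.4f}")
```

A single run shows one realisation only; almost-sure convergence is a statement about the whole sequence, so the printed distances should be read as an illustration rather than a proof.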

Glivenko-Cantelli Type Result

  • Suppose \(K(\cdot)\) is a function of bounded variation and the series \(\sum_{n=1}^{\infty}e^{-rnh_n^2}\) converges for every \(r>0\). Then \[\sup_{x \in \mathbb{R}}|f_n(x)-f(x)| \rightarrow 0 \text{ a.s.}\] as \(n \to \infty\) if and only if the density \(f\) is uniformly continuous.
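
A rough numerical illustration of this result is sketched below. It assumes a Gaussian kernel, a standard logistic sample, and the bandwidth sequence \(h_n = n^{-1/5}\), for which \(nh_n^2 = n^{3/5}\) and the series in the condition converges for every \(r>0\). The supremum over the real line is approximated by a maximum over a finite grid, which is a further assumption of the sketch.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def logistic_pdf(x):
    """Density of the standard logistic distribution."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def kde(x_grid, sample, h, kernel=gaussian_kernel):
    """Kernel density estimate evaluated on x_grid."""
    u = (x_grid[:, None] - sample[None, :]) / h
    return kernel(u).sum(axis=1) / (sample.size * h)

rng = np.random.default_rng(5)
grid = np.linspace(-10.0, 10.0, 201)
sup_error = {}
for n in (100, 1000, 10000):
    h_n = n ** (-1.0 / 5.0)  # assumed bandwidth sequence
    sample = rng.logistic(size=n)
    sup_error[n] = float(np.max(np.abs(kde(grid, sample, h_n) - logistic_pdf(grid))))
    print(f"n = {n:5d}  h_n = {h_n:.3f}  max|f_n - f| = {sup_error[n]:.4f}")
```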

Sample: Logistic using Epanechnikov Kernel

Animated GIF

Sample: Logistic using Gaussian Kernel

Animated GIF

Sample: Logistic using Logistic Kernel

Animated GIF

Sample: Logistic using Box Kernel

Animated GIF

Sample: Cauchy using Epanechnikov Kernel

Animated GIF

Sample: Cauchy using Gaussian Kernel

Animated GIF

Sample: Cauchy using Logistic Kernel

Animated GIF

Sample: Cauchy using Box Kernel

Animated GIF

Sample: Exponential using Epanechnikov Kernel

Animated GIF

Sample: Exponential using Gaussian Kernel

Animated GIF

Sample: Exponential using Logistic Kernel

Animated GIF

Sample: Exponential using Box Kernel

Animated GIF

Sample: Normal using Epanechnikov Kernel

Animated GIF

Sample: Normal using Gaussian Kernel

Animated GIF

Sample: Normal using Logistic Kernel

Animated GIF

Sample: Normal using Box Kernel

Animated GIF

Checking for Asymptotic Normality

  • Let \(X_1, \ldots, X_n\) be i.i.d. random variables with density \(f\), and let \(f_n(x)=\frac{1}{nh_n} \sum_{i=1}^{n}K\big(\frac{x-X_i}{h_n}\big)\). Under regularity conditions (in particular \(h_n \to 0\) and \(nh_n \to \infty\)), \[ \frac{f_n(x)-E(f_n(x))}{\sqrt{Var(f_n(x))}} \xrightarrow{d} \mathcal{N}(0,1) \quad \text{as } n \to \infty.\]
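
One way to check this by simulation is to recompute the estimate at a fixed point over many independent samples and look at the standardized values. The sketch below makes several assumptions that are not taken from the slides: a Gaussian kernel, a Logistic(0,1) sample, the bandwidth \(h_n = n^{-1/5}\), and standardization by the Monte Carlo mean and standard deviation in place of the exact \(E(f_n(x))\) and \(Var(f_n(x))\).

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def standardized_kde_values(p, n, reps, rng):
    """Standardized KDE values at the p-th quantile of Logistic(0,1)."""
    x0 = np.log(p / (1.0 - p))  # logistic quantile function
    h = n ** (-1.0 / 5.0)       # assumed bandwidth sequence
    samples = rng.logistic(size=(reps, n))
    # One KDE value f_n(x0) per replication (row).
    f_vals = gaussian_kernel((x0 - samples) / h).mean(axis=1) / h
    # Monte Carlo mean and sd stand in for E(f_n(x0)) and sqrt(Var(f_n(x0))).
    return (f_vals - f_vals.mean()) / f_vals.std(ddof=1)

rng = np.random.default_rng(4)
z = standardized_kde_values(p=0.5, n=500, reps=2000, rng=rng)

# Share of standardized values inside the central 95% range of N(0, 1).
coverage = float(np.mean(np.abs(z) <= 1.96))
print(f"share within +/-1.96: {coverage:.3f}")
```

A histogram or normal Q-Q plot of `z` gives a visual version of the same check; changing `p` moves the evaluation point to other quantiles, where the approximation may need a larger sample size.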

Sample: Logistic(0,1) at 0.05th quantile

Animated GIF

Sample: Logistic(0,1) at 0.05th quantile

Animated GIF

Sample: Logistic(0,1) at 0.05th quantile

Animated GIF

Sample: Logistic(0,1) at 0.1th quantile

Animated GIF

Sample: Logistic(0,1) at 0.1th quantile

Animated GIF

Sample: Logistic(0,1) at 0.1th quantile

Animated GIF

Sample: Logistic(0,1) at 0.25th quantile

Animated GIF

Sample: Logistic(0,1) at 0.25th quantile

Animated GIF

Sample: Logistic(0,1) at 0.25th quantile

Animated GIF

Sample: Logistic(0,1) at 0.5th quantile

Animated GIF

Sample: Logistic(0,1) at 0.5th quantile

Animated GIF

Sample: Logistic(2,1) at 0.5th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.05th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.05th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.05th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.1th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.1th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.1th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.25th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.25th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.25th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.5th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.5th quantile

Animated GIF

Sample: 0.5*N(2,1) + 0.5*N(-2,1) at 0.5th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.05th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.05th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.05th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.1th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.1th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.1th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.25th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.25th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.25th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.5th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.5th quantile

Animated GIF

Sample: Weibull(shape=1, scale=2.5) at 0.5th quantile

Animated GIF

THANK YOU